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ABSTRACT 

The recently introduced collaborative nonnegative matrix factoriza¬ 
tion (CoNMF) algorithm was conceived to simultaneously estimate 
the number of endmembers, the mixing matrix, and the fractional 
abundances from hyperspectral linear mixtures. This paper intro¬ 
duces R-CoNMF, which is a robust version of CoNMF. The robust¬ 
ness has been added by a) including a volume regularizer which 
penalizes the distance to a mixing matrix inferred by a pure pixel 
algorithm; and by b) introducing a new proximal alternating opti¬ 
mization (PAO) algorithm for which convergence to a critical point 
is guaranteed. Our experimental results indicate that R-CoNMF pro¬ 
vides effective estimates both when the number of endmembers are 
unknown and when they are known. 

Index Terms — Hyperspectral imaging, spectral unmixing, end- 
member extraction, collaborative nonnegative matrix factorization 
(CoNMF). 

1. INTRODUCTION 

Spectral unmixing is one of the most active topics for remotely 
sensed hyperspectral data exploitation EM- In general, most 
methods use linear mixture model (LMM) due to its simplicity, 
where LMM assumes that each observed pixel in a hyperspectral 
image is linearly combined by a set of pure spectra or endmem¬ 
bers. Within the LMM paradigm, three main types of unmixing 
algorithms can be identified: geometrical, statistical, and sparse 
regression-based 0. Geometrical unmixing algorithms work under 
the assumption that the endmembers of a hyperspectral image are 
the vertices of a simplex of minimum volume enclosing the dataset 
(i.e., the set of hyperspectral vectors) or of a simplex of maximum 
volume contained in the convex hull of the dataset 0, such as the 
minimum-volume enclosing simplex (MVES) 0, minimum volume 
simplex analysis (MVSA) 0, the successive volume maximization 
(SVMAX) 0, and etc. Among minimum volume algorithms, we 
highlight the minimum volume constrained nonnegative matrix fac¬ 
torization (MVC-NMF CD) which is also based on NMF principles. 
As their name suggests, statistical methods m are based on analyz¬ 
ing mixed pixels by means of statistical principles, such as Bayesian 
approaches mm Finally, sparse regression-based algorithms CD 
are based on expressing each mixed pixel in a scene as a linear 
combination of a finite set of pure spectral signatures that are known 
a priori and available in a library. Although each of these methods 
exhibits their own advantages and disadvantages, the fact is that 
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geometrical approaches have been the ones most frequently used by 
the hyperspectral research community up to now 0. This is mainly 
due to their reduced -although still quite high- computational cost 
when compared with the other types of unmixing algorithms, as well 
as to the fact that they represent a straightforward interpretation of 
the LMM. In order to fully unmix a given hyperspectral image by 
means of a geometrical method, the majority of the state-of-the-art 
approaches are based on dividing the whole process into three con¬ 
catenated steps 1) estimation of the number of endmembers; 2) 
identification of the spectral signatures of these endmembers; and 
3) estimation of the endmember abundances in each pixel of the 
scene. In the last few years, several techniques have been devel¬ 
oped for addressing each part of the chain, with particular emphasis 
on the identification of endmembers (with and without assuming 
the presence of pure spectral signatures in the input hyperspectral 
data OH)- 

The recently introduced CoNMF algorithm El addresses si¬ 
multaneously the three unmixing stages. In addition to a quadratic 
data term, CoMNF uses two regularizers: a) the £ 2,1 mixed norm 
CCD applied to the abundance matrix, which promotes sparsity 
among the rows of that matrix, and, therefore, it selects the active 
endmembers, and b) a volume regularizer which simultaneously 
promotes minimum volume and pushes the endmembers with zero 
abundances toward the mean value of the dataset. In spite of the 
good results obtained with CoNMF, it has two weaknesses: a) there 
is no guarantee of convergence; b) in some cases, we observe a neg¬ 
ative joint effect of the volume and mixed norm regularizers. In this 
paper, we modify CoMNF aiming at removing those weaknesses. 
The former is removed by introducing a PAO solver to compute a 
critical point of the objective function. The latter is removed by 
modifying the volume regularizer aiming at robustness to noise and 
degenerated simplexes as often happens with real data and relatively 
high number of endmembers. 

The remainder of this paper is organized as follows. Section [2| 
introduces R-CoNMF. Section [3] presents results based on synthetic 
hyperspectral data sets. Comparisons with the MVC-NMF are also 
included. Finally, Section 0| draws some conclusions and hints at 
plausible future research lines. 

2. THE PROPOSED APPROACH 

Let Y = [yi,..., y n ] £ R dXn be matrix representation of a hyper¬ 
spectral image with n spectral vectors and d spectral bands. Under 
the LMM, we have 0 : 

Y = MS + N (1) 

s.t.: S > 0, l£s = 1 p 




Fig. 1. Illustration of the concept of row (collaborative) sparsity 
promoted £ 2,1 norm under the linear mixing model. The abundance 
matrix S is formed my the nonzero rows of X, while the mixing 
matrix M is formed by the correspondent columns of A. 

where M = [mi,..., m p ] G R dXp is a so-called mixing matrix con¬ 
taining p endmembers, m ? ; denotes the z-th endmember signature, 
S = [si,..., s n ] G R pXn is the abundance matrix containing the 
endmember fractions s*, for pixels i — 1,..., n, S > 0 is the abun¬ 
dance non-negativity constraint (to be understood in component¬ 
wise sense), and lJS = l p is the sum-to-one constraint (l p stands 
for a column vector of ones with p elements). Finally, N collects the 
errors that may affect the measurement process (e.g., noise). 

2.1. Estimation criterion 

In this work, we address the estimation of p, i.e, the number of end- 
members, as well as the estimation of the mixing matrix M, and the 
abundance matrix S. Although p is not known beforehand, we as¬ 
sume that we have access to an overestimate thereof. That is, we are 
given a number q such that q > p. In this way, we account for a 
common situation in which an overestimate of the number of end- 
members is easy to compute, which is not usually the case regarding 
the true number of endmembers. We tackle the estimation of p , M, 
and S by seeking a solution for the following NMF optimization: 

min (1/2) ||Y - AX\\ 2 F + a ||X|| 2jl + f ||A - P|| 2 F (2) 

S.t.: X G Sq — 1 A G Aq — 1, 

where ||-|| F stands for the Frobenius norm, A = [ai,... ,a g ] G 
R dXq and X G R qXn (x 1 and Xj will denote, respectively, the z- 
th row, and the z-th column of X) are optimization variables, linked 
with the mixing matrix M and the abundance matrix S, respectively. 
The relation between (A, X) and (M, S) is illustrated in Fig. [l] the 
term ||X|| 2 1 = Yli=i || x *|| 2 denotes the £ 2,1 (see, e.g., |'14J) mixed 
norm of matrix X, the term ||A — P||^ denotes the minimum vol¬ 
ume regularizer, P = [y^,..., y iq ] is a set of q spectral observed 
vectors inferred with a pure-pixel algorithm, thus close to the ex¬ 
tremes of the simplex, a and /3 are regularization parameters, S q -1 
is the collection of q x n matrices whose columns belong to the 
probability simplex of dimension q — 1, and A q -1 is the collection 
of matrices of size dx q whose colums belong to the the affine set of 
dimension q — 1 that best represents the data Y in the mean square 
error sense. The introduction of this constraint removes the short¬ 
comings associated violations to the sum-to-one constraints usually 
observed in real datasets. 

The objective function shown in 0 has three terms: a data fi¬ 
delity term ||Y — AX||^, which promotes solutions with low recon¬ 
struction error; an £ 2,1 mixed norm ||X|| 2 1 , which promotes row 
sparsity on X fl5l (that is, it promotes solutions with complete rows 
x 1 set to zero); and a minimum volume term ||A — P||^, which 
pushes the simplex defined by A towards the simplex defined by 
P which, given that the it is defined by observed vectors, is inside 


the true simplex. This volume regularizer is a fundamental device of 
R-CoNMF as it largely reduces the sensitive of the solutions of (13 
due to bad conditioning of the true mixing matrix M and to pertur¬ 
bations of the samples close to the simplex facets. 

2.2. Optimization algorithm and convergence 

The nonconvex data fidelity term (1/2) ||Y — AX||^ present in (|2l) 
makes the respective optimization hard. Herein, we adopt the fol¬ 
lowing PAO iterative algorithm: 

A ( t+i) = arg min L(A, X (t) ) + ^ ||A - A (t) |||, (3) 

AG^4q_i Z 

X ( t+i) = arg min L(A (t+1) ,X) + £||X - X (t) ||J, (4) 

S£Oq_i Z 

where L(A,X) is the objective function shown in (|3 and A,/z > 
0 are two positive constants. We remark that the above procedure 
can be interpreted as a regularized version of a two block non-linear 
Gauss-Seidel method lH6l . The POA algorithm ©-(HJ) is an instance 
of the class considered in El for which the convergence to a critical 
point is proved in the Theorem 9 of the same paper. Optimization © 
is a small size quadratic problem, thus very light, and optimization 
© is a constrained £2 — £ 2,1 problem which we solve effectively 
with the CSUNSAL algorithm fl8l . 

2.3. Applying R-CoNMF 

R-CoNMF may be applied either assuming that a) the number of 
endmembers is know or b) unknown. In the former case, we set 
q — p and a to a very small value, thus removing the £ 2,1 regular¬ 
izer from the objective function, although keeping the constraint set 
S q — 1 . In the latter case, we first apply R-CoNMF to infer the num¬ 
ber of endmembers and then we apply again R-CoNMF as described 
in scenario a). 

Let us consider the scenario b). In this case, we run R-CoNMF 
for a fixed q > p. Lets us define £(z) = ||x z || 2 , for i — 1,..., q, 
as a measure of the sparsity of abundances associated with the cor¬ 
responding endmembers a^, for i = 1,..., q. Ideally, we should 
have £(z) = 0 if a i is not active. Due to the impact or noise and 
model errors, we relax that criterion as follows: we consider that 
an endmember a i is active, if £(z) > £, for a small £ > 0. Fig. 
[3 shows the obtained £ and the reconstruction error for a problem 
with p = 6, n = 4000, and zero-mean Gaussian iid noise with 
SNR= 30dB, where SNR = 101og 10 (E[||MS||^/E||N|||). The 
applications of the above criterion with any value of £ between 0.5 
and 4 yields p = 6, which is the correct estimate of the number of 
endmembers. Notice that, this number could also be obtained from 
the plot of the reconstruction error. In more complex scenarios with 
lower SNR and a larger number of endmembers, we may devise a 
strategy to combine both indicators. 

3. EXPERIMENTAL RESULTS 

In this section, we evaluate the proposed R-CoNMF method us¬ 
ing synthetic hyperspectral data. The advantage of using synthetic 
scenes is that they offer a fully controlled analysis scenario in which 
the properties of our algorithm can be investigated precisely. The 
synthetic scenes have been generated using the LMM in 0. The 
scenes comprise n — 4000 pixels, and the spectral signatures used 
for their generation were randomly selected from the United States 
Geological Survey (USGS) digital spectral library!]]- In order to en- 

1 Available online: http://speclab.cr.usgs.gov/spectral-lib.html 












































(a) (b) 

Fig. 2. (a) Reconstruction error as a function of the estimated num¬ 
ber of endmembers. (b) Degree of sparseness £ for q = 15. 


sure the difference among the endmembers used for simulation pur¬ 
poses, the spectral angle distance (SAD) between any two spectral 
signatures is bigger than 10 degrees. Furthermore, let/w be number 
of endmembers in one pixel. In real scenarios, it is possible to have a 
large number of endmembers in a scene, for instance, p > 10. How¬ 
ever, it is unlikely to have a large size of endmembers in one pixel. 
That is, in general, p m i x is relatively small, say p m ix < 5. Based on 
this observation, for the simulated data if p > 5, we set /w = 5. 
Otherwise, if p < 5, we set p m ix = p . Finally, to ensure that no pure 
pixels are present in the synthetic images, we discard all pixels with 
abundance fractions larger than 0.8, i.e., max(si) < 0.8. 

In the case of q = p, let M = A and S = X denote the esti¬ 
mates of M and S, respectively. As performance indicators, we use 
the relative reconstruction error RRE = ||Y — MS|||r/||Y||F,the 
SAD (in degrees), and two error metrics focused on evaluating the 
quality of the estimated endmembers: ||M — M||f, and the qual¬ 
ity of the estimated abundances: ==||S — S\\f- It should be 

noted that, in all our experiments, we projected the data into a q- 
dimensional subspace with q > p. In Subsection 13.11 where the 
number of endmembers is assumed unknown, we set a — 10 -5 and 
/3 — 10“*. In Subsection 13.21 where the number of endmembers is 
assumed known, we set /3 = 10“ 5 and a was hand tuned for optimal 
performance. 


3.1. Experiment 1 




Fig. 3. Estimated fractional abundance matrix X for problems with 
n — 4000 pixels and SNR=30dB. 



The first experiment aims at showing the good capability of R- 
CoNMF to provide the correct number of endmembers. Fig. [3] 
shows the obtained fractional abundance matrix for four different 
problems with p = 6 and p — 10 endmembers, respectively, by 
using different values of q. In all cases, we considered a scenario 
with SNR=30dB. Let q be the number of endmembers estimated by 
R-CoNMF. In Figs. [3 a) and(3jb), where the number of endmembers 
is relative small (i.e., p — 6), it is easy to detect the real number 
of endmembers, that is, p = 6. When the number of endmember 
increases, as shown in Figs. [3]c) and[3jd), the R-CoNMF can still 
provide a good estimate p — 10 even though the problem is much 
more difficult in this case. In this experiment R-CoNMF provides 
effective estimates and works according to our expectation. How¬ 
ever, a more detailed evaluation demands extensive experiments 
with simulated and real data. 


3.2. Experiment 2 

In a second experiment we evaluate the performance of R-CoNMF 
in the case that the number of estimated endmembers coincides with 
the number of real endmembers, i.e., q — p. Here, we use the MYC- 


Fig. 4. Spectral signatures of the endmembers extracted by R- 
CoNMF and MVC-NMF as compared to the reference signatures 
used for the simulation of a synthetic scene with q — p — 6 and 
SNR=30dB. 



Fig. 5. (a) Difference between real and estimated abundances for the 
R-CoNMF algorithm, (b) Difference between the real and estimated 
abundances for the MVC-NMF algorithm. 


































































Table 1. Evaluation of the performance of R-CoNMF and MVC-NMF in the unmixing of a synthetic hyperspectral data set, simulated with 
SNR=30dB, for different numbers of endmembers and q — p, where means no results. 



R-CoNMF 

MVC-NMF 

q = p 

1 M M F 

* ||S S|| F 

vn Xp" 11 

SAD 

RRE 

1 M M ,, 

* ||S S|| F 

Vn Xp" 11 

SAD 

RRE 

4 

0.17±0.09 

0.01±3e-3 

0.64±0.46 

0.03±0.0 

0.84±0.47 

0.03±0.02 

2.54±1.73 

0.20±0.05 

6 

0.20±0.08 

0.01±4e-3 

0.56±0.21 

0.03±le-4 

1.44±1.17 

0.04±0.02 

3.48±2.57 

0.23±0.04 

8 

0.76±0.23 

0.02±0.01 

1.89±0.67 

0.03±le-3 

36.50±154.23 

0.08±0.05 

11.97±19.48 

0.25±0.07 

10 

1.28±0.50 

0.03±0.01 

2.77±1.73 

0.03±le-3 

- 

- 

- 

- 

15 

3.13±1.81 

0.05±0.01 

5.25±3.87 

0.04±le-3 

- 

- 

- 

- 


NMF algorithm in (H as a baseline for comparison with our method. 
Table Q] displays the results obtained by MVC-NMF and our pro¬ 
posed R-CoNMF algorithm for all considered performance discrim¬ 
inators for different values of q — p = {4, 6,8,10,15}. In all cases, 
we considered an SNR of 30 dB and reported the results obtained 
from averaging the results of 30 Monte Carlo runs. 

From the results reported on Table [l] we can make the following 
observations. First and foremost, when there are only a few end- 
members existing in the image (say, q = p = 4), both R-CoNMF 
and MVC-MNF obtained very good results. This is expected, since 
in this case it is relatively easy to solve the optimization problem. 
It is interesting to observe that, as the number of endmembers in¬ 
creases, the R-CoNMF obtained very good performance (note the 
good performance obtained for the case q — p — 10). Even in a 
very difficult scenario such as q — p = 15, the solution provided 
by R-CoNMF is still useful. It should be noted that, in cases with a 
relatively high number of endmembers (i.e., q = p > 8), the MVC- 
NMF yields useless results. This is because, when the number of 
endmember increases, most pixels are likely to fluctuate around the 
facets, a situation in which minimum volume-based algorithms are 
likely to fail (3). Even in this difficult scenario, in which the MVC- 
NMF could not provide feasible results (labeled as in Table [T}, 
the proposed R-CoNMF was able to provide a reasonable solution. 
Based on this experiment, we can conclude that R-CoNMF is quite 
robust and has no strong constraints related with the quality of the 
analyzed data set. 

For illustrative purposes, Fig. 0 ] shows the signatures estimated 
by R-CoNMF and MVC-NMF. The estimated spectral signatures by 
R-CoNMF are similar to the real ones, while those estimated by 
MVC-NMF are slightly different. Similar observations can be made 
from the difference maps between the real and estimated abundance 
maps, as shown in Fig. [5j where the difference of R-CoNMF is much 
smaller than that of the MVC-NMF. The results in this subsection 
indicate that the R-CoNMF can perform very accurately in the case 
that the number of endmembers is known a priori , i.e. q — p. 


4. CONCLUSIONS AND FUTURE LINES 

In this paper, we propose R-CoNMF, which is robust version of the 
collaborative nonnegative matrix factorization (CoNMF) algorithm, 
which estimates the number of endmembers, the mixing matrix, and 
the corresponding abundances. The proposed R-CoNMF algorithm 
fills a gap in the hyperspectral unmixing literature as it is one of 
the few algorithms that can accomplish the three main steps of the 
unmixing chain in fully automatic fashion, without the need to resort 
to external algorithms. In the future, we will evaluate the algorithm 
using real hyperspectral data sets. 
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